Wide-Area Egomotion from Omnidirectional Video and Coarse 3D Structure
Abstract
This thesis describes a method for real-time vision-based localization in human-made environments. Given a coarse model of the structure (walls, floors, ceilings, doors and windows) and a video sequence, the system computes the camera pose (translation and rotation) in model coordinates with an accuracy of a few centimeters in translation and a few degrees in rotation. The system has several novel aspects: it performs 6-DOF localization; it handles visually cluttered and dynamic environments; it scales well over regions extending through several buildings; and it runs over several hours without losing lock.
We demonstrate that the localization problem can be split into two distinct problems: an initialization phase and a maintenance phase. In the initialization phase, the system determines the camera pose with no information other than a search region provided by the user (building, floor, area, room). This step is computationally intensive and is run only once, at startup. We present a probabilistic method to address the initialization problem using a RANSAC framework. In the maintenance phase, the system keeps track of the camera pose from frame to frame without any user interaction. This phase is computationally lightweight to allow a high processing frame rate and is coupled with a feedback loop that helps reacquire “lock” when it has been lost. We demonstrate a simple, robust geometric tracking algorithm based on correspondences between 3D model lines and 2D image edges.
We present navigation results on several real datasets across the MIT campus with cluttered, dynamic environments. The first dataset consists of a five-minute robotic exploration across the Robotics, Vision and Sensor Network Lab. The second dataset consists of a two-minute hand-held 3D motion sequence in the same lab space. The third dataset consists of a 26-minute exploration across MIT buildings 26 and 36. We also present a detailed analysis of the system performance, along with several failure modes and ideas to address them.
Thesis Supervisor: Seth Teller
Title: Associate Professor of Computer Science and Engineering
Similar resources
Reduced egomotion estimation drift using omnidirectional views
Estimation of camera motion from a given image sequence is a common task for multi-view 3D computer vision applications. Salient features (lines, corners, etc.) in the images are used to estimate the motion of the camera, also called egomotion. This estimation suffers from an error build-up as the length of the image sequence increases, which causes a drift in the estimated position. In this l...
Real-Time Estimation of Fast Egomotion with Feature Classification Using Compound Omnidirectional Vision Sensor
For fast egomotion of a camera, computing feature correspondence and motion parameters by global search becomes highly time-consuming. Therefore, the complexity of the estimation needs to be reduced for real-time applications. In this paper, we propose a compound omnidirectional vision sensor and an algorithm for estimating its fast egomotion. The proposed sensor has both multi-baselines and a l...
Single-Image Omnidirectional Vision Systems: A Survey of Models, Calibration, and Applications
By Carlos Jaramillo, Graduate Center, City University of New York. Advisor: Professor Jizhong Xiao. The prominent use of omnidirectional vision sensors (ODVS) in various areas of computer vision, such as visual odometry for navigation (egomotion), structure reconstruction (mapping) and localization, tele...
Fast Intra Mode Decision for Depth Map coding in 3D-HEVC Standard
Three-dimensional high efficiency video coding (3D-HEVC) is the extended version of the latest video compression standard, high efficiency video coding (HEVC), and is used to compress 3D videos. 3D videos include texture video and depth maps. Since the statistical characteristics of depth maps are different from those of texture videos, new tools have been added to the HEVC standard fo...
Omnidirectional texturing of human actors from multiple view video sequences
In 3D video, recorded object behaviors can be observed from any viewpoint, because the 3D video registers the object’s 3D shape and color. However, the real-world views are limited to the views from a number of cameras, so only a coarse model of the object can be recovered in real time. It then becomes necessary to judiciously texture the object with images recovered from the cameras. One of th...
Publication date: 2007